
    Cross-calibration of Time-of-flight and Colour Cameras

    Time-of-flight cameras provide depth information, which is complementary to the photometric appearance of the scene in ordinary images. It is desirable to merge the depth and colour information, in order to obtain a coherent scene representation. However, the individual cameras will have different viewpoints, resolutions and fields of view, which means that they must be mutually calibrated. This paper presents a geometric framework for this multi-view and multi-modal calibration problem. It is shown that three-dimensional projective transformations can be used to align depth and parallax-based representations of the scene, with or without Euclidean reconstruction. A new evaluation procedure is also developed; this allows the reprojection error to be decomposed into calibration and sensor-dependent components. The complete approach is demonstrated on a network of three time-of-flight and six colour cameras. The applications of such a system, to a range of automatic scene-interpretation problems, are discussed.
    Comment: 18 pages, 12 figures, 3 tables
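    The core alignment step described above can be illustrated with a minimal sketch: a 4×4 matrix acting on homogeneous 3D points, of which a rigid depth-to-colour camera transform is a special case. The matrix and points below are hypothetical examples, not values from the paper.

    ```python
    import numpy as np

    def to_homogeneous(pts):
        """Append a unit coordinate: (N, 3) -> (N, 4)."""
        return np.hstack([pts, np.ones((pts.shape[0], 1))])

    def apply_projective(H, pts):
        """Apply a 4x4 projective transform H to 3D points (N, 3),
        then divide out the homogeneous coordinate."""
        ph = to_homogeneous(pts) @ H.T
        return ph[:, :3] / ph[:, 3:4]

    # Hypothetical example: a pure translation between camera centres
    # (a rigid transform has bottom row [0, 0, 0, 1]).
    H = np.eye(4)
    H[:3, 3] = [0.1, 0.0, -0.05]
    pts = np.array([[0.0, 0.0, 1.0], [0.2, -0.1, 2.0]])
    aligned = apply_projective(H, pts)
    ```

    A general projective transform (arbitrary bottom row) would be handled identically, which is why the homogeneous division is kept even for this rigid example.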

    Automatic Detection of Calibration Grids in Time-of-Flight Images

    It is convenient to calibrate time-of-flight cameras by established methods, using images of a chequerboard pattern. The low resolution of the amplitude image, however, makes it difficult to detect the board reliably. Heuristic detection methods, based on connected image-components, perform very poorly on this data. An alternative, geometrically-principled method is introduced here, based on the Hough transform. The projection of a chequerboard is represented by two pencils of lines, which are identified as oriented clusters in the gradient-data of the image. A projective Hough transform is applied to each of the two clusters, in axis-aligned coordinates. The range of each transform is properly bounded, because the corresponding gradient vectors are approximately parallel. Each of the two transforms contains a series of collinear peaks; one for every line in the given pencil. This pattern is easily detected, by sweeping a dual line through the transform. The proposed Hough-based method is compared to the standard OpenCV detection routine, by application to several hundred time-of-flight images. It is shown that the new method detects significantly more calibration boards, over a greater variety of poses, without any overall loss of accuracy. This conclusion is based on an analysis of both geometric and photometric error.
    Comment: 11 pages, 11 figures, 1 table
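    The underlying voting idea can be sketched with an ordinary (ρ, θ) line Hough transform over point votes; this is a simplified stand-in for the paper's projective, axis-aligned variant, and the data below are hypothetical.

    ```python
    import numpy as np

    def hough_lines(points, thetas, rho_res=1.0, rho_max=100.0):
        """Vote each point into a (theta, rho) accumulator.
        Peaks in the accumulator correspond to lines through many points."""
        rhos = np.arange(-rho_max, rho_max + rho_res, rho_res)
        acc = np.zeros((len(thetas), len(rhos)), dtype=int)
        for x, y in points:
            for i, t in enumerate(thetas):
                rho = x * np.cos(t) + y * np.sin(t)
                j = int(round((rho + rho_max) / rho_res))
                if 0 <= j < len(rhos):
                    acc[i, j] += 1
        return acc, rhos

    # Hypothetical data: 100 points on the vertical line x = 10.
    pts = [(10.0, float(y)) for y in range(100)]
    thetas = np.linspace(0, np.pi, 180, endpoint=False)
    acc, rhos = hough_lines(pts, thetas)
    i, j = np.unravel_index(acc.argmax(), acc.shape)
    # Peak at theta = 0, rho = 10: the line x = 10.
    ```

    In the paper's setting the gradient vectors of each pencil are approximately parallel, which is what allows the range of θ to be bounded tightly around one orientation.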

    View-based approaches to spatial representation in human vision

    In an immersive virtual environment, observers fail to notice the expansion of a room around them and consequently make gross errors when comparing the size of objects. This result is difficult to explain if the visual system continuously generates a 3-D model of the scene based on known baseline information from interocular separation or proprioception as the observer walks. An alternative is that observers use view-based methods to guide their actions and to represent the spatial layout of the scene. In this case, they may have an expectation of the images they will receive but be insensitive to the rate at which images arrive as they walk. We describe the way in which the eye movement strategy of animals simplifies motion processing if their goal is to move towards a desired image and discuss dorsal and ventral stream processing of moving images in that context. Although many questions about view-based approaches to scene representation remain unanswered, the solutions are likely to be highly relevant to understanding biological 3-D vision.

    Filter Transformations for Shift-Insensitive Feature Detection

    The representation of oriented image-structure is an important part of most biological vision models. It is possible, for example, to estimate both motion and binocular disparity from the responses of oriented filters (Adelson & Bergen 1985, JOSA A 2(2), 284-299). It is particularly useful to combine the responses of different filters, in order to obtain a response to edge-like structures that is insensitive to slight shifts (in the direction perpendicular to the edge). It has been hypothesized that complex cells achieve this by separating the local energy of the signal from its phase. We describe an alternative approach, which is based on the 'local jet' representation (Koenderink & van Doorn 1987, Biol. Cyb. 55, 367-375). Each jet is computed from a set of oriented derivative filters, of order 1 to N, which are applied at a given image location. We show that these filters can be used as a basis for a new set, which contains filters of a single order, each at a slightly different location. The maximum response, over the new set, is insensitive to small image-shifts. This approach can be justified by noting that a Taylor approximation of the shifted Kth order filter can be obtained from the N-K higher-order filters in the jet. It is shown, however, that a least-squares construction is more practical. Finally, it is noted that the responses of the new filters can be obtained from a linear transformation of the original N image derivatives.
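    The Taylor argument can be made concrete in one dimension: given the jet (the signal and its derivatives at one location), the response at a nearby shift follows from a truncated Taylor series. This is a hypothetical 1-D illustration of that reasoning step, not the paper's 2-D oriented-filter or least-squares construction.

    ```python
    import math

    def taylor_shift(jet, d):
        """Approximate the response at shift d from the local jet, i.e.
        the responses of derivative filters of order 0..N at one point."""
        return sum(fk * d ** k / math.factorial(k) for k, fk in enumerate(jet))

    # Hypothetical example: the jet of sin(x) at x = 0, up to order 3.
    jet = [0.0, 1.0, 0.0, -1.0]       # sin(0), sin'(0), sin''(0), sin'''(0)
    approx = taylor_shift(jet, 0.1)   # close to sin(0.1)
    ```

    The accuracy of the shifted estimate degrades as the shift grows, which is consistent with the text's point that the shifted Kth-order filter is recovered only approximately from the finitely many higher-order filters in the jet.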

    Detection and Localization of 3D Audio-Visual Objects Using Unsupervised Clustering

    This paper addresses the issues of detecting and localizing objects in a scene that are both seen and heard. We explain the benefits of a human-like configuration of sensors (binaural and binocular) for gathering auditory and visual observations. It is shown that the detection and localization problem can be recast as the task of clustering the audio-visual observations into coherent groups. We propose a probabilistic generative model that captures the relations between audio and visual observations. This model maps the data into a common audio-visual 3D representation via a pair of mixture models. Inference is performed by a version of the expectation-maximization algorithm, which is formally derived, and which provides cooperative estimates of both the auditory activity and the 3D position of each object. We describe several experiments with single- and multiple-speaker detection and localization, in the presence of other audio sources.
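    The clustering step can be sketched with a minimal EM loop for an isotropic Gaussian mixture over points in a common 3D space. This is only an illustration of the EM idea on hypothetical data; the paper's generative model ties audio and visual observations together and is not reproduced here.

    ```python
    import numpy as np

    def em_gmm(X, K, iters=50):
        """Minimal EM for an isotropic 3D Gaussian mixture (a sketch of
        the clustering idea only, not the paper's audio-visual model)."""
        # Farthest-point initialisation keeps the starting means separated.
        mu = [X[0]]
        for _ in range(1, K):
            d2 = ((X[:, None, :] - np.array(mu)[None]) ** 2).sum(-1).min(1)
            mu.append(X[np.argmax(d2)])
        mu = np.array(mu)
        var = np.full(K, 1.0)
        pi = np.full(K, 1.0 / K)
        for _ in range(iters):
            # E-step: posterior responsibility of each component per point.
            d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)       # (N, K)
            logp = -0.5 * d2 / var - 1.5 * np.log(var) + np.log(pi)
            r = np.exp(logp - logp.max(1, keepdims=True))
            r /= r.sum(1, keepdims=True)
            # M-step: re-estimate weights, means and isotropic variances.
            nk = r.sum(0)
            pi = nk / len(X)
            mu = (r[:, :, None] * X[:, None, :]).sum(0) / nk[:, None]
            d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
            var = (r * d2).sum(0) / (3 * nk) + 1e-6
        return mu, pi

    # Hypothetical data: two well-separated "speakers" in a common 3D frame.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal([0, 0, 1], 0.1, (50, 3)),
                   rng.normal([2, 0, 1], 0.1, (50, 3))])
    mu, pi = em_gmm(X, K=2)
    ```

    The recovered means land near the two generating centres, and the mixture weights near 0.5 each; the paper's derivation additionally yields per-object auditory activity from the joint model.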